Member rate £492.50
Non-Member rate £985.00
Save £45 Loyalty discount applied automatically*
Save 5% on each additional course booked
*If you attended our Methods School in the last calendar year, you qualify for £45 off your course fee.
Monday 1 to Friday 5 August and Monday 8 to Friday 12 August 2016
Generally classes are either 09:00-12:30 or 14:00-17:30
30 hours over 10 days
This course aims to familiarize students with the principles of multilevel modelling and their implementation in R. The course manual is Gelman and Hill (2007), and the two main R packages used will be lme4 and nlme. Broadly speaking, we will build and estimate models that become more complex step by step, discussing each decision in this process. Although this is a methods course, every lecture will focus on how the statistical model is linked to possible theories or particular hypotheses, with added focus on potential limitations and correct interpretation. After laying the basic foundations, during the second week we switch gears and work with more complex models (e.g. deep interactions, cross-classified models, longitudinal analysis). By the end of the course, participants are expected to be able to argue clearly why they use a multilevel model in their own research papers, which specification suits the research question and the data (including relatively complex questions and data structures), how the models are specified, what the results mean, and how they can be integrated with previous research. Each lecture will discuss both theoretical principles and practical implementation, whereas the lab sessions are dedicated solely to practical implementation.
Zoltán Fazekas is a Postdoctoral Researcher at the Department of Political Science, University of Oslo.
He earned his PhD in political science at the Department of Methods in the Social Sciences, University of Vienna, where he was an Early Stage Researcher in the Marie Curie Initial Training Network in Electoral Democracy, ELECDEM.
Zoltán holds a BA in Economics, an MA in European Affairs and an MA in Political Science. His fields of interest are: comparative electoral behaviour, political psychology, and quantitative methods.
Multilevel models have become popular in recent years. However, the switch from educational sciences, with students nested in different schools (or teachers), to questions such as how the volatility of a party system might influence individual political decisions is not trivial. These models allow researchers to test various comparative hypotheses that previously were only tentative explanations in research carried out on separate country samples. Furthermore, the abundance of cross-national survey data (such as EES, ESS, CSES) increasingly invites researchers to use these models. Nevertheless, multilevel linear models have specific assumptions, and their use is guided both by theoretical reasoning and by data properties. Once a general understanding of multilevel models is acquired, they prove to be an extremely valuable and versatile set of tools for complex questions. In this course we will focus on three core aspects of applied multilevel modelling: 1) properties and specification of multilevel models, 2) linking method with theories of heterogeneity, and 3) implementation in R.
The course runs over two weeks. The first week is dedicated to the general principles of multilevel models and their implementation, including varying-intercept, varying-slope multilevel linear models, with additional focus on uncertainty, prediction, and limitations. The second week is dedicated to more advanced topics that usually appear in applied research. Each day has a lecture component and a lab component. Alongside the assigned readings, the end of the first week will feature an overview homework, and we start the second week by reviewing this homework together. We will work with multiple datasets (overwhelmingly survey data) throughout the course, specifying more complicated models step by step, or extending our models to accurately reflect the formulated theory.
The lab sessions (taught in R) accompany the course, and in them we will go through examples of multilevel models in applied research. We will extract, display and discuss quantities of interest and link them directly to the concepts covered during the lectures. We will also discuss how these quantities should be reported in an academic paper and how they can be easily formatted and exported from R. The lab sessions can be described as “supervised individual/group work”.
After a brief discussion of the course logistics we will review principles of inference, linear regression and assumption violations on day one. We introduce examples of comparative research questions and hypotheses, and what sort of data and method requirements have to be met.
Day two is dedicated to nested data structures and the challenges these raise for pooled regression models. We review the alternatives, such as comparing single-group regressions run separately or running a pooled regression with cluster-corrected standard errors, along with their limitations. We then start discussing properties of the variables that will be included in the multilevel models. Here, the sources of variation (within and between groups) are of specific interest.
Day three offers the methodological and statistical transition from pooled regression to multilevel modelling with varying intercepts and varying slopes. We focus on the principles and assumptions of multilevel models, discussing the great benefits but also the possible limitations (from both a theoretical and a statistical perspective). We will also spend quite some time on clarifying notation and the meaning of key terms (e.g. fixed and random effects, correlation between random effects).
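As a point of reference for the notation discussed on day three, one common way to write a varying-intercept, varying-slope model with a single individual-level predictor is sketched below. The symbols are illustrative assumptions in the style of Gelman and Hill (2007), not necessarily the exact notation used in the lectures: $i$ indexes individuals, $j[i]$ is the group of individual $i$, and $x_i$ is the predictor.

```latex
\begin{aligned}
y_i &= \alpha_{j[i]} + \beta_{j[i]}\, x_i + \epsilon_i,
  \qquad \epsilon_i \sim \mathcal{N}(0, \sigma_y^2), \\
\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix}
  &\sim \mathcal{N}\!\left(
    \begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},
    \begin{pmatrix}
      \sigma_\alpha^2 & \rho\,\sigma_\alpha\sigma_\beta \\
      \rho\,\sigma_\alpha\sigma_\beta & \sigma_\beta^2
    \end{pmatrix}
  \right)
\end{aligned}
```

In this form, $\mu_\alpha$ and $\mu_\beta$ correspond to what is often called the fixed effects, the group-specific deviations $\alpha_j - \mu_\alpha$ and $\beta_j - \mu_\beta$ to the random effects, and $\rho$ to the correlation between random effects.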
On day four we will extend our models to include multiple predictors, discussing both data preparation (such as centering) and interpretation, with a strong focus on cross-level interactions and comparative model fit evaluation.
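The centering step mentioned for day four can be illustrated with a minimal sketch of group-mean centering, which splits a predictor into a between-group part (the group mean) and a within-group part (the deviation from it). The sketch is in Python purely for illustration, since the labs themselves use R; the function name, the variable names and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_within_groups(values, groups):
    """Split a predictor into within-group deviations and group means."""
    by_group = defaultdict(list)
    for x, g in zip(values, groups):
        by_group[g].append(x)
    # Between-group component: one mean per group
    group_means = {g: mean(xs) for g, xs in by_group.items()}
    # Within-group component: each value minus its own group's mean
    within = [x - group_means[g] for x, g in zip(values, groups)]
    between = [group_means[g] for g in groups]
    return within, between


# Hypothetical data: an individual-level predictor observed in two countries
income = [10.0, 20.0, 30.0, 40.0, 60.0]
country = ["A", "A", "A", "B", "B"]

within, between = center_within_groups(income, country)
print(within)   # deviations from each country's mean
print(between)  # the country mean, repeated for each respondent
```

Entering the within and between components as separate predictors is one common way to keep individual-level and group-level relationships apart; whether that is appropriate depends on the research question.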
The last day of the first week is dedicated to uncertainty, prediction, and power. The first two are quantities or procedures that are important both for contextualizing our inferences and for better reporting of our results. Within the framework of multilevel models, the presence of random effects raises several considerations about how we calculate uncertainty around the estimates and how we present our results using new data for model-based predictions.
We kick off day six with a review of the weekend homework assignment and then switch to generalized linear models. After a short overview of link functions and the different quantities that we are interested in (such as predicted probabilities of a particular outcome category), we will focus on multilevel models for dichotomous variables and counts, including second-level predictors and cross-level interactions.
On day seven we analyze data where we have a more complicated nesting structure. As in education research where pupils attend a particular elementary school and then a particular high school, we can find these situations in many other research areas. Observations can be nested in both countries and years, or specific survey responses can be nested in individuals and different modes, and so on. We will discuss cross-classified and multiple membership models in order to accurately account for this nesting structure and evaluate hypotheses that are linked to multiple grouping units.
We dedicate day eight to deep interactions and poststratification, an extremely useful approach for deriving sub-group level estimates (for example, for geographic and demographic sub-categories) from data that is available only at a higher level (such as a nationally representative survey). In most cases researchers also have access to rich official statistics at sub-group levels (but not to good-quality attitudinal data). Combined in a multilevel framework, these sources can enhance the quality of sub-group level estimates of attitudes and reported behaviors, even with relatively small sample sizes.
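The poststratification step itself is simple arithmetic, sketched below in Python purely for illustration (the labs use R). The sketch assumes that cell-level estimates have already been obtained, for instance from a fitted multilevel model, and that population counts per cell come from official statistics. All names and numbers are hypothetical.

```python
def poststratify(cell_estimates, cell_counts):
    """Population-weighted average of cell-level estimates."""
    total = sum(cell_counts[cell] for cell in cell_estimates)
    weighted = sum(cell_estimates[cell] * cell_counts[cell]
                   for cell in cell_estimates)
    return weighted / total


# Hypothetical model-based estimates of support within one region,
# by demographic cell (age group, type of settlement)
estimates = {
    ("young", "urban"): 0.60,
    ("young", "rural"): 0.40,
    ("old", "urban"): 0.50,
    ("old", "rural"): 0.30,
}

# Hypothetical population counts for the same cells
counts = {
    ("young", "urban"): 300,
    ("young", "rural"): 100,
    ("old", "urban"): 200,
    ("old", "rural"): 400,
}

region_support = poststratify(estimates, counts)
print(round(region_support, 2))
```

With these made-up numbers the regional estimate works out to 0.44, lower than the unweighted mean of the four cells (0.45), because the most populous cell has the lowest estimated support.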
One rather specific, but still intuitive application of multilevel models appears in the case of modelling longitudinal data. In many cases, change (continuous or discontinuous) throughout time is of interest for researchers, and the multilevel framework offers extensive possibilities to accurately model this, easily incorporating time varying predictors for example. We extend the basic longitudinal models to be able to handle unbalanced data (variably spaced data), but also data with varying numbers of measurement occasions, as these problems often characterize real data stemming from surveys. These models will be the topic of day nine.
On the last day, after a summary, we will look into future directions and important extensions for applied research. Most notably, the bulk of quantitative comparative research uses survey data, and thus response behavior and its systematic cross-country variation can be a real issue. We will discuss multilevel item-response models to account for these possible problems and review previous examples of how substantive findings might change when these effects are not considered. Finally, we introduce how to transition to Bayesian estimation for hierarchical models.
Students should be able to use R comfortably (or transition fast from Stata/SPSS, which is easy) and have solid prior knowledge of linear regression, including a solid understanding of its assumptions. We will do a brief review of linear regression, but that does not substitute for in-depth knowledge of and experience with linear regression models (OLS and maximum likelihood). Three sample books that can help in reviewing these concepts are:
For R, students can consult many freely available resources, but a good book to accompany a systematic review is: Adler, J. (2010). R in a Nutshell: A Desktop Quick Reference. O'Reilly Media
Day | Topic | Details |
---|---|---|
Monday 1 | Introduction to multilevel models: theory and data requirements | Equal split of 3 hours: 90min lecture, 90min lab |
Tuesday 2 | From complete or no-pooling to partial pooling | Equal split of 3 hours: 90min lecture, 90min lab |
Wednesday 3 | THE model: Varying intercept and slope | Equal split of 3 hours: 90min lecture, 90min lab |
Thursday 4 | Predictors and cross-level interactions | Equal split of 3 hours: 90min lecture, 90min lab |
Friday 5 | Uncertainty, prediction, and power | Equal split of 3 hours: 90min lecture, 90min lab |
Monday 8 | Multilevel generalized linear models | Equal split of 3 hours: 90min lecture, 90min lab |
Tuesday 9 | Models for cross-classified data | Equal split of 3 hours: 90min lecture, 90min lab |
Wednesday 10 | Deep interactions | Equal split of 3 hours: 90min lecture, 90min lab |
Thursday 11 | Longitudinal models | Equal split of 3 hours: 90min lecture, 90min lab |
Friday 12 | Future directions: transition to Bayesian estimation and multilevel item-response models | Equal split of 3 hours: 90min lecture, 90min lab |
Day | Readings |
---|---|
Monday 1 | Kass, R. E. (2011). Statistical inference: The big picture. Statistical Science, 26(1), 1-9. Rainey, Carlisle. 2014. "Arguing for a Negligible Effect." American Journal of Political Science 58(4): 1083-1091. For review: Gelman and Hill (2007), Chapters 2, 3, and 4. |
Tuesday 2 | Gelman and Hill (2007), Chapters 1 and 11. King, Gary, and Margaret E. Roberts. Early View (2014). How Robust Standard Errors Expose Methodological Problems They Do Not Fix, and What to Do About It. Political Analysis: 1-21. |
Wednesday 3 | Gelman and Hill (2007), Chapters 12 and 13. Steenbergen, M. R., & Jones, B. S. (2002). Modeling multilevel data structures. American Journal of Political Science, 46(1), 218-237. |
Thursday 4 | Enders, C. K., & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods, 12(2), 121-138. Pittau, M. G., Zelli, R., & Gelman, A. (2010). Economic disparities and life satisfaction in European regions. Social Indicators Research, 96(2), 339-361. If not already familiar, for review: Brambor, T., Clark, W. R., & Golder, M. (2006). Understanding interaction models: Improving empirical analyses. Political Analysis, 14(1), 63-82. Berry, W. D., Golder, M., & Milton, D. (2012). Improving tests of theories positing interaction. The Journal of Politics, 74(03), 653-671. |
Friday 5 | Gelman and Hill (2007), Chapters 20, 21, and 24. Stegmueller, D. (2013). How many countries for multilevel modeling? A comparison of frequentist and Bayesian approaches. American Journal of Political Science, 57(3), 748-761. |
Monday 8 | Gelman and Hill (2007), Chapters 5, 14, and 15. Bates, D. M. (2010). lme4: Mixed-effects modeling with R. URL http://lme4.r-forge.r-project.org/book. Chapter 6. |
Tuesday 9 | Fielding, A., & Goldstein, H. (2006). Cross-classified and multiple membership structures in multilevel models: an introduction and review. DfES. Browne, W. J., Goldstein, H., & Rasbash, J. (2001). Multiple membership multiple classification (MMMC) models. Statistical Modelling, 1(2), 103-124. Bates, D. M. (2010). lme4: Mixed-effects modeling with R. URL http://lme4.r-forge.r-project.org/book. Chapter 2. |
Wednesday 10 | Lax, Jeffrey, and Justin Phillips. 2009b. “How Should We Estimate Public Opinion in the States?” American Journal of Political Science 53(1): 107–21. Ghitza, Y., & Gelman, A. (2013). Deep interactions with MRP: Election turnout and voting patterns among small electoral subgroups. American Journal of Political Science, 57(3), 762-776. Wang, W., Rothschild, D., Goel, S., & Gelman, A. (2014). Forecasting Elections with Non-Representative Polls. International Journal of Forecasting. Forthcoming. |
Thursday 11 | Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. Oxford University Press, USA. Chapters 1, 3, 4, 5, 6, and 7. Bates, D. M. (2010). lme4: Mixed-effects modeling with R. URL http://lme4.r-forge.r-project.org/book. Chapter 3. Recommended: Steele, F. (2008). Multilevel models for longitudinal data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 171(1), 5-19. Yang, Y., & Land, K. C. (2008). Age–Period–Cohort Analysis of Repeated Cross-Section Surveys: Fixed or Random Effects? Sociological Methods & Research, 36(3), 297-326. |
Friday 12 | Gelman, A., & Shalizi, C. R. (2013). Philosophy and the practice of Bayesian statistics. British Journal of Mathematical and Statistical Psychology, 66(1), 8-38. Jackman, S. (2009). Bayesian analysis for the social sciences (Vol. 846). Wiley, Part I, Chapter 1. Stegmueller, D. (2011). Apples and oranges? The problem of equivalence in comparative research. Political Analysis, 19(4), 471-487. |
The latest version of R (at least R 3.2), with the ability to install packages on the go if needed.
Participants need to bring their own laptop with R installed.
In addition to the mandatory readings, these pieces can be very useful for alternative perspectives, applications, and further learning:
Introduction to Generalized Linear Modeling
Interpreting Binary Logistic Regression Models
Multilevel Regression Modeling
Advanced Topics in Applied Regression
Panel Data Analysis: hierarchical structures, heterogeneity and serial dependence
Age-period-cohort analysis
Introduction to Bayesian Inference